
Implement Spark decimal multiply and divide #4613

Closed

Conversation

@jinchengchenghh (Contributor) commented Apr 14, 2023

Use the Arrow Gandiva BasicDecimal128 algorithm to compute the value. Introduce an intermediate int256_t value to avoid overflow during computation.

Add a dependency on Boost multiprecision so we can #include <boost/multiprecision/cpp_int.hpp>, which defines int256_t.

Arrow implementation:

Resolves: #4773
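For reference, a minimal sketch of the int256_t-widening idea using boost::multiprecision (illustrative only; the actual implementation follows Gandiva's BasicDecimal128 and also handles rescaling and rounding):

  #include <boost/multiprecision/cpp_int.hpp>

  using int128_t = __int128_t;
  using int256_t = boost::multiprecision::int256_t;

  // Multiply two unscaled decimal values in 256-bit space so the raw product
  // cannot overflow, then check whether the result still fits into the int128
  // that backs a long decimal. Names here are illustrative.
  int128_t multiplyWithOverflowCheck(int128_t a, int128_t b, bool& overflow) {
    const int256_t wide = int256_t(a) * int256_t(b);
    const int256_t kMax = (int256_t(1) << 127) - 1;
    const int256_t kMin = -(int256_t(1) << 127);
    overflow = wide > kMax || wide < kMin;
    return overflow ? 0 : wide.convert_to<int128_t>();
  }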

"min(38, a_precision - a_scale + b_scale + max(6, a_scale + b_precision + 1))")
.integerVariable(
"r_scale",
"min(37, max(6, a_scale + b_precision + 1))") // if precision is
Contributor Author

Usually we get the result type from the function signature constraint, but Spark decimal arithmetic is very complex, and I'm not sure we can represent the computation as a string. If we can, can anyone guide me? Or could we pass a function to the signature and compute the result type there?
Right now I compute the result type a second time, which is not a good solution, and it causes functions that invoke this one, such as makeCheckOverflow, to get a wrong input argument type.

In Spark, the decimal arithmetic form is check_overflow(a * b); we use the argument type inside the function's apply, so it is correct again after this function.

Contributor Author

Do you have any suggestions here? @mbasmanova
You could review just the function signature; the other parts will be refactored after #4434 is merged.

Contributor

@jinchengchenghh

> spark decimal arithmetic is very complex,

Would you share the details?

Contributor Author

This is the Spark decimal result type computation: https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala#L604

I don't know how to express the if/else logic in a string constraint.
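For reference, the adjustment Spark applies when allowPrecisionLoss is true (DecimalType.adjustPrecisionScale) looks roughly like this; a C++ transcription, shown only as a sketch:

  #include <algorithm>
  #include <utility>

  // Roughly mirrors Spark's DecimalType.adjustPrecisionScale for
  // allowPrecisionLoss = true: cap precision at 38 and, when digits must be
  // dropped, keep at least min(scale, 6) fractional digits so the integral
  // part is not truncated.
  std::pair<int, int> adjustPrecisionScale(int precision, int scale) {
    constexpr int kMaxPrecision = 38;
    constexpr int kMinAdjustedScale = 6;
    if (precision <= kMaxPrecision) {
      return {precision, scale};
    }
    const int intDigits = precision - scale;
    const int minScale = std::min(scale, kMinAdjustedScale);
    const int adjustedScale = std::max(kMaxPrecision - intDigits, minScale);
    return {kMaxPrecision, adjustedScale};
  }

This if/else is what the ternary operator added to the constraint grammar later in this thread is meant to express.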

Contributor

@jinchengchenghh Is allowPrecisionLoss a session property? I imagine we would need to define 2 separate functions, one for allowPrecisionLoss=true and another for allowPrecisionLoss=false, and choose the right kind based on the value of allowPrecisionLoss.

  // When `spark.sql.decimalOperations.allowPrecisionLoss` is set to true, if the precision / scale
  // needed are out of the range of available values, the scale is reduced up to 6, in order to
  // prevent the truncation of the integer part of the decimals.
  protected def allowPrecisionLoss: Boolean = SQLConf.get.decimalOperationsAllowPrecisionLoss

Contributor Author

Yes, it defaults to true; the current implementation is based on the default value.

  val DECIMAL_OPERATIONS_ALLOW_PREC_LOSS =
    buildConf("spark.sql.decimalOperations.allowPrecisionLoss")
      .internal()
      .doc("When true (default), establishing the result type of an arithmetic operation " +
        "happens according to Hive behavior and SQL ANSI 2011 specification, i.e. rounding the " +
        "decimal part of the result if an exact representation is not possible. Otherwise, NULL " +
        "is returned in those cases, as previously.")
      .version("2.3.1")
      .booleanConf
      .createWithDefault(true)

allowPrecisionLoss is a rarely used config; someone interested in it can add the new function for allowPrecisionLoss=false, perhaps in another PR.

Contributor Author

Do you have suggestions on how to generate the result type with a function like adjustPrecisionScale? I don't have much context on the string constraint, which is then evaluated by expression::calculation::evaluate. @mbasmanova

Contributor

@jinchengchenghh I'm not sure I understand this question. It might be better to create a GitHub issue and move this discussion over there. Please, CC: @majetideepak and @karteekmurthys

@majetideepak (Collaborator)
@jinchengchenghh there is an overhaul of the Velox Type implementation underway.
See the discussion in #4069.
As a result, the Decimal Type implementation will be changing with #4434.
This work will have to be refactored to follow the new implementation.

@jinchengchenghh (Contributor Author)
@majetideepak OK, I will refactor after it merges

@jinchengchenghh (Contributor Author)
Looks like this Boost version cannot include /usr/local/include/boost/multiprecision/cpp_int.hpp.
I can build successfully locally with Boost version 1.71.0.0ubuntu2.
Can you guide me on how to fix this problem in the CI environment? @majetideepak

My build error is:

velox/functions/sparksql/CMakeFiles/velox_functions_spark.dir/DecimalArithmetic.cpp.o -MF velox/functions/sparksql/CMakeFiles/velox_functions_spark.dir/DecimalArithmetic.cpp.o.d -o velox/functions/sparksql/CMakeFiles/velox_functions_spark.dir/DecimalArithmetic.cpp.o -c ../../velox/functions/sparksql/DecimalArithmetic.cpp
In file included from /usr/local/include/boost/multiprecision/detail/generic_interconvert.hpp:9,
                 from /usr/local/include/boost/multiprecision/number.hpp:26,
                 from /usr/local/include/boost/multiprecision/cpp_int.hpp:12,
                 from ../.././velox/functions/sparksql/DecimalUtil.h:19,
                 from ../../velox/functions/sparksql/DecimalArithmetic.cpp:20:
/usr/local/include/boost/multiprecision/detail/default_ops.hpp:910:17: error: 'iterator_range' in namespace 'boost::mpl' does not name a template type
  910 |    typedef mpl::iterator_range<start_seq, typename mpl::end<list_type>::type> range;
      |                 ^~~~~~~~~~~~~~
/usr/local/include/boost/multiprecision/detail/default_ops.hpp:914:17: error: template argument 1 is invalid
  914 |        pred_type>::type iter_type;
      |                 ^
/usr/local/include/boost/multiprecision/detail/default_ops.hpp:914:18: error: expected identifier before '::' token
  914 |        pred_type>::type iter_type;
      |                  ^~
/usr/local/include/boost/multiprecision/detail/default_ops.hpp:914:20: error: typedef name may not be a nested-name-specifier

@jinchengchenghh (Contributor Author)
My build command is make debug EXTRA_CMAKE_FLAGS="-DVELOX_ENABLE_ARROW=ON".

It succeeds in my local environment, but raises the following exception in CI. Can you help me figure out how to fix this issue? @mbasmanova

boost::mpl::list exists in boost/mpl/aux_/preprocessed/plain/list.hpp, but I don't think I can include it directly, because it does not have an include guard such as #ifndef BOOST_MPL_ITERATOR_RANGE_HPP_INCLUDED.

velox/functions/sparksql/CMakeFiles/velox_functions_spark.dir/DecimalArithmetic.cpp.o -MF velox/functions/sparksql/CMakeFiles/velox_functions_spark.dir/DecimalArithmetic.cpp.o.d -o velox/functions/sparksql/CMakeFiles/velox_functions_spark.dir/DecimalArithmetic.cpp.o -c ../../velox/functions/sparksql/DecimalArithmetic.cpp
In file included from ../.././velox/functions/sparksql/DecimalUtil.h:25,
                 from ../../velox/functions/sparksql/DecimalArithmetic.cpp:20:
/usr/local/include/boost/multiprecision/cpp_int.hpp:1224:13: error: 'list' is not a member of 'boost::mpl'
 1224 |        mpl::list<
      |             ^~~~
/usr/local/include/boost/multiprecision/cpp_int.hpp:1224:13: note: suggested alternatives:
In file included from /opt/rh/gcc-toolset-9/root/usr/include/c++/9/list:63,
                 from _deps/folly-src/folly/system/AtFork.h:21,
                 from _deps/folly-src/folly/detail/ThreadLocalDetail.h:41,
                 from _deps/folly-src/folly/ThreadLocal.h:52,
                 from _deps/folly-src/folly/executors/GlobalThreadPoolList.h:25,
                 from _deps/folly-src/folly/executors/ThreadPoolExecutor.h:28,
                 from _deps/folly-src/folly/executors/CPUThreadPoolExecutor.h:24,

@mbasmanova (Contributor)
CC: @majetideepak @kgpai

Deepak, Krishna, any chance you can help here?

@kgpai (Contributor) commented Jun 5, 2023

@mbasmanova Taking a look.

@kgpai (Contributor) commented Jun 5, 2023

@jinchengchenghh Can you give more details on your local env? Are you on Linux using gcc?
One way to replicate this locally is to:

  1. Install docker
  2. Pull the docker image used for CI (docker pull ghcr.io/facebookincubator/velox-dev:circleci-avx )
  3. Copy over velox source code to that container (docker cp /path/to/your/velox /path/in/container)
  4. Then replicate the build failure there and subsequently fix.

Feel free to reach out to me on Slack (@kpai) if you have any questions or need assistance.

@jinchengchenghh (Contributor Author)
Thank you very much. I found that in my local environment BoostSource is AUTO, but in CI it is BUNDLED. I updated the Boost build script, and this problem has been fixed. @mbasmanova @kgpai

@jinchengchenghh jinchengchenghh marked this pull request as draft June 6, 2023 05:42
@jinchengchenghh jinchengchenghh marked this pull request as ready for review June 6, 2023 06:56
@jinchengchenghh (Contributor Author)
Can you help review? Thanks! @kgpai

uint64_t lo = static_cast<uint64_t>(word);
return (hi == 0) ? 64 + __builtin_clzll(lo) : __builtin_clzll(hi);
} else {
return -1;
Collaborator

throw VELOX_UNSUPPORTED instead of returning -1?
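A sketch of that suggestion (assuming the surrounding template from the diff; VELOX_UNSUPPORTED is the standard Velox error macro, and the exact code in the PR may differ):

  #include <cstdint>
  #include <type_traits>

  #include "velox/common/base/Exceptions.h"

  // Sketch only: handle uint64_t and uint128_t inputs and throw on anything
  // else instead of returning -1, as suggested in the review.
  // Assumes word != 0 (__builtin_clzll(0) is undefined).
  template <typename T = uint64_t>
  inline int32_t countLeadingZeros(T word) {
    if constexpr (std::is_same_v<T, uint64_t>) {
      return __builtin_clzll(word);
    } else if constexpr (std::is_same_v<T, __uint128_t>) {
      const auto hi = static_cast<uint64_t>(word >> 64);
      const auto lo = static_cast<uint64_t>(word);
      return (hi == 0) ? 64 + __builtin_clzll(lo) : __builtin_clzll(hi);
    } else {
      VELOX_UNSUPPORTED("countLeadingZeros supports only uint64_t and uint128_t");
    }
  }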

@majetideepak (Collaborator)
@jinchengchenghh Can we converge the existing DecimalUtil functionality with this? Looks like this is an improvement and could benefit Presto as well.

@jinchengchenghh (Contributor Author)
sparksql::DecimalUtil::divideWithRoundUp can divide without overflow in more cases, I'm not sure if this behavior complies with Presto. Presto may expect overflow in such cases? @majetideepak

@jinchengchenghh (Contributor Author) commented Jun 12, 2023

Can you help review? Thanks! @mbasmanova

@majetideepak (Collaborator)
> sparksql::DecimalUtil::divideWithRoundUp can divide without overflow in more cases, I'm not sure if this behavior complies with Presto. Presto may expect overflow in such cases? @majetideepak

Presto Java uses the BigInteger class for decimals. I believe it does not overflow easily either, similar to Spark.
@karteekmurthys any thoughts on this?

@jinchengchenghh (Contributor Author) commented Jun 26, 2023

maxBitsRequiredAfterScaling will check whether it needs to widen to int256 to guarantee no overflow, but efficiency will be lower than before since it cannot get the accurate bits required after scaling. Can we accept the performance regression? If yes, I will refactor this into the current DecimalUtil.h. @majetideepak @karteekmurthys @mbasmanova

@karteekmurthys (Contributor)
> sparksql::DecimalUtil::divideWithRoundUp can divide without overflow in more cases, I'm not sure if this behavior complies with Presto. Presto may expect overflow in such cases? @majetideepak
>
> Presto Java uses the BigInteger class for decimals. I believe it does not overflow easily as well similar to spark. @karteekmurthys any thoughts on this?

That is true. It uses BigInteger support and doesn't overflow that easily. Our decimal arithmetic operations do verify whether the results overflow decimal limits and throw exceptions if they do. You can refer to the unit tests and e2e tests for that.

return makeLongDecimalFlatVector(int128s, DECIMAL(precision, scale));
}

int128_t convertStringToInt128(const std::string& value, bool& nullOutput) {
Contributor

Use int128_t HugeInt::parse(const std::string& str)
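For example (using the HugeInt::parse API named above; shown only as an illustration of the suggested change), the helper call in the test hunk below could become:

  // Instead of convertStringToInt128("201" + std::string(31, '0'), nullOutput):
  const int128_t value = HugeInt::parse("201" + std::string(31, '0'));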

bool nullOutput;
testDecimalExpr<TypeKind::HUGEINT>(
makeLongDecimalFlatVector(
{convertStringToInt128("201" + std::string(31, '0'), nullOutput)},
Contributor

Or you could replace it with int128_t build(uint64_t hi, uint64_t lo) from Hugeint.h.

I understand that the string form is more readable but slower.

Contributor

In tests, we should favor readability as top performance is not important.

int256_t inAbs = abs(in);
bool isNegative = in < 0;

uint128_t unsignResult = (inAbs & uint128Mask).convert_to<uint128_t>();
Contributor
If possible, these two can be merged.

Contributor Author

No, T is int64_t while the variable type uses uint64_t; we cannot get a template UT as uint64_t.

@karteekmurthys (Contributor) Jun 27, 2023

Does it require templating? You are using specific types in the functions, so you could simply overload the functions.
Or maybe you can still merge convert<int64_t> and convert<int128_t>.

Do

typedef typename std::conditional<std::is_same_v<TInput, int64_t>,uint64_t,__uint128_t>::type UT;

This way you can switch the unsigned type depending on T.
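A sketch of the merged helper along those lines (the point is the std::conditional type selection; the actual masking and overflow handling in the PR differ):

  #include <boost/multiprecision/cpp_int.hpp>
  #include <cstdint>
  #include <type_traits>

  using int128_t = __int128_t;
  using uint128_t = __uint128_t;
  using int256_t = boost::multiprecision::int256_t;

  // One templated convert covering both int64_t and int128_t outputs; the
  // unsigned companion type is chosen via std::conditional, as suggested.
  template <typename TInput>
  TInput convert(int256_t in, bool& overflow) {
    using UT = typename std::
        conditional<std::is_same_v<TInput, int64_t>, uint64_t, uint128_t>::type;
    constexpr int kBits = sizeof(UT) * 8;

    const int256_t inAbs = abs(in);
    const bool isNegative = in < 0;

    // |in| must fit the signed range of TInput (the minimum-value edge case
    // is ignored in this sketch).
    overflow = (inAbs >> (kBits - 1)) != 0;
    if (overflow) {
      return 0;
    }
    const UT unsignedResult = inAbs.convert_to<UT>();
    return isNegative ? -static_cast<TInput>(unsignedResult)
                      : static_cast<TInput>(unsignedResult);
  }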

Contributor Author

Thanks for your guidance, fixed.

@jinchengchenghh jinchengchenghh force-pushed the decimal_arith branch 2 times, most recently from 3460bbb to bc0363a Compare June 28, 2023 05:12
@jinchengchenghh (Contributor Author)
Do you have further comments? @karteekmurthys

@mbasmanova (Contributor) left a comment

@jinchengchenghh Thank you for the contribution.

> Use Arrow gandiva BasicDecimal128 algorism to compute value, it will introduce int256_t to get much bigger result scale than presto.
> For some corner cases, it is an approximate value to Spark.

This description is hard to understand as it is lacking details. What is the "Arrow gandiva BasicDecimal128 algorism"? Would you include a link and/or summary of the algorithm?

> it will introduce int256_t to get much bigger result scale than presto.

What is "much bigger result scale"? Would you clarify?

> For some corner cases, it is an approximate value to Spark.

What are these corner cases? Would you provide examples and a description of these cases?

@@ -22,6 +22,14 @@ var ([[:alpha:]][[:alnum:]_]*)
"=" return Parser::token::ASSIGN;
"min" return Parser::token::MIN;
"max" return Parser::token::MAX;
"<" return Parser::token::LT;
Contributor

What are these changes for? It would be nice to extract these into a separate PR and add a test.

Contributor Author

This enhances the function signature support, so @majetideepak suggested I add it in this PR: #4773 (comment)

Are you sure we should extract these into a separate PR?

Contributor

Hmm... Can you describe these changes in the PR description? Are there tests for these? Can you give me a pointer?

Contributor Author

https://github.com/facebookincubator/velox/pull/4613/files#diff-4baee9f82f9c347f753972415d345941d2c2a110b608d480b090650171da096bR379
It is used to compute the result scale in the getResultScale function:

std::string getResultScale(std::string precision, std::string scale) {
  return fmt::format(
      "({}) <= 38 ? ({}) : max(({}) - ({}) + 38, min(({}), 6))",
      precision,
      scale,
      scale,
      precision,
      scale);
}

The current decimal multiply and divide tests cover the <= and ?: changes. I also extended the calculation to support basic operators like <.

If the computation is wrong, we cannot get the correct result type.
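For illustration, if the multiply result type before adjustment has precision a_precision + b_precision + 1 and scale a_scale + b_scale (per the Spark rules discussed above), getResultScale expands to a constraint string like the following (line breaks added for readability; the exact expressions used in the PR may differ):

  (a_precision + b_precision + 1) <= 38
      ? (a_scale + b_scale)
      : max((a_scale + b_scale) - (a_precision + b_precision + 1) + 38,
            min((a_scale + b_scale), 6))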

Contributor

Got it. Let's extract these changes into a separate PR and add a test to velox/expression/tests/SignatureBinderTest.cpp to cover all of added functionality.

@@ -52,7 +52,9 @@ set(BOOST_HEADER_ONLY
crc
circular_buffer
math
mpl
Contributor

Are you adding new dependencies to Velox? Please update the PR description to clearly state that and explain why they are needed.

Contributor Author

Here I include this header file, so we need it: https://github.com/facebookincubator/velox/pull/4613/files#diff-4b21d26191cbbdb9bc90cafd83aed144a4483b1c594f224cf3bbe67f401f5f83R19
#include <boost/multiprecision/cpp_int.hpp>
It introduces int256_t.

inline int32_t countLeadingZeros(uint64_t word) {
return __builtin_clzll(word);
template <typename T = uint64_t>
inline int32_t countLeadingZeros(T word) {
Contributor

It would be nice to extract this change into a separate PR. This would make this PR smaller and easier to review.

@@ -57,6 +57,12 @@ Mathematical Functions
SELECT 2L / 2L; -- 1.0
SELECT 3 / 0; -- NULL

.. spark:function:: divide(x, y) -> decimal

Returns the results of dividing x by y. The types of x and y must be decimal.
Contributor

Would you update the docs for the existing "divide(x, y) -> double" to clarify what types of x and y it supports?

Can x and y be decimals with different precision / scale? If so, can you make this clear? What would be the precision / scale of the result? Please clarify.

using uint128_t = __uint128_t;
using int256_t = boost::multiprecision::int256_t;

class DecimalUtil {
Contributor

Please, document this class and its public methods.

// bits_required(x * 10^y) <= bits_required(x) + floor(log2(10^y)) + 1
// We precompute floor(log2(10^x)) + 1 for x = 0, 1, 2...75, 76

static const int32_t floorLog2PlusOne[] = {
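For context, a minimal sketch of how such a table bounds the bits needed after scaling by a power of ten (names and table length here are illustrative; the PR's table extends to 76):

  #include <cassert>
  #include <cstdint>

  using int128_t = __int128_t;
  using uint128_t = __uint128_t;

  // floor(log2(10^x)) + 1 for x = 1..38 (index 0 is 0: scaling by 10^0 adds
  // no bits).
  static const int32_t kFloorLog2PlusOne[] = {
      0,  4,  7,  10, 14, 17, 20, 24, 27, 30, 34, 37, 40,
      44, 47, 50, 54, 57, 60, 64, 67, 70, 74, 77, 80, 84,
      87, 90, 94, 97, 100, 103, 107, 110, 113, 117, 120, 123, 127};

  // Number of significant bits in |value|.
  int32_t bitsRequired(int128_t value) {
    uint128_t abs = value < 0 ? -static_cast<uint128_t>(value)
                              : static_cast<uint128_t>(value);
    int32_t bits = 0;
    while (abs != 0) {
      abs >>= 1;
      ++bits;
    }
    return bits;
  }

  // Upper bound on the bits needed for value * 10^scaleBy, via
  // bits_required(x * 10^y) <= bits_required(x) + floor(log2(10^y)) + 1.
  // If the bound exceeds 127, the computation must widen to int256_t.
  int32_t maxBitsRequiredAfterScaling(int128_t value, int32_t scaleBy) {
    assert(scaleBy >= 0 && scaleBy <= 38);
    return bitsRequired(value) + kFloorLog2PlusOne[scaleBy];
  }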
Contributor

Please, check coding style document and update the PR to comply. Here, the naming needs to be updated.

@jinchengchenghh jinchengchenghh force-pushed the decimal_arith branch 5 times, most recently from e83dd19 to 6a73c7b Compare August 24, 2023 14:19
@jinchengchenghh (Contributor Author)
Do you have further comments? @mbasmanova

@mbasmanova (Contributor) left a comment

@Yuhta Jimmy, would you help review and merge this PR?

@mbasmanova mbasmanova requested a review from Yuhta August 24, 2023 20:00
@jinchengchenghh jinchengchenghh force-pushed the decimal_arith branch 3 times, most recently from 718a8f1 to 90c84de Compare September 11, 2023 02:58
@jinchengchenghh
Copy link
Contributor Author

Do you have further comments? Thanks! @Yuhta

@zhouyuan (Contributor)
@Yuhta gentle ping, this patch is the basis of the decimal-related functions for Spark SQL.

thanks, -yuan

@Yuhta (Contributor) left a comment

@jinchengchenghh Looks good overall. Can you rebase and run the new functions through the Spark function fuzzer to make sure there are no corner-case bugs?

@jinchengchenghh (Contributor Author)
Rebased and passed the Spark fuzzer test; can you help merge it? Thanks! @Yuhta

@facebook-github-bot
@kgpai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
@kgpai merged this pull request in 68b4698.

@conbench-facebook

Conbench analyzed the 1 benchmark run on commit 68b4698c.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details.
